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Cross-reactions and other systematic issues generated by the coupling of functional chemical subsystems pose 
the largest challenge for assembling a viable protocell in the laboratory. Our current work seeks to identify and 
clarify such key issues as we represent and analyze in simulation a full implementation of a minimal protocell. 
Using a 3D dissipative particle dynamics (DPD) simulation method we are able to address the coupled diffusion, 
self-assembly, and chemical reaction processes, required to model a full life cycle of the protocell, the protocell 
being composed of coupled genetic, metabolic, and container subsystems. Utilizing this minimal structural and 
functional representation of the constituent molecules, their interactions, and their reactions, we identify and 
explore the nature of the many linked processes for the full protocellular system. Obviously the simplicity of 
this simulation method combined with the inherent system complexity prevents us from expecting quantitative 
simulation predictions from these investigations. However, we report important findings on systemic processes, 
some previously predicted, and some newly discovered, as we couple the protocellular self-assembly processes 
and chemical reactions. For example, our simulations indicate that the container stability is significantly affected 
by the amount of oily precursor lipids and sensitizers and affect the partition of molecules in the container divi- 
sion process. Also a continuous supply of oily lipid precursors to the protocell environment at a very slow rate 
will pulse the protocellular loading (feeding) as oil blobs will form in water and whole blobs will be absorbed at 
one time. By orchestrating the precursor injection rate compared to diffusion, precursor self-assembly, protocell 
concentration, etc., an optimal size resource package can be spontaneously generated. Replication of the am- 
phiphilic genes is better on the surface of a micelle with a substantial oil core (loaded micelle) than on a regular 
micelle due to the higher aggregate stability. Also replication of amphiphilic genes (genes with lipophilic back- 
bones) in bulk water can be inhibited due to their tendency to form aggregates. Further the template directed 
gene ligation rate depends not only on the component monomers but also on the sequence of these monomers 
in the template. 
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I. INTRODUCTION 

The twilight zone that separates nonliving matter from life 
involves the assembly of and cooperation among different 
sub-components, which we can identify as metabolism, in- 
formation, and compartment. None of these ingredients are 
living and none of them can be ignored when looking at life 
as a whole. When assembled appropriately in a functional 
manner, their systemic properties constitute minimal life. 

Understanding the tempo and mode of the transition from 
nonliving to living matter requires a considerable effort of 
simplification compared to modern life. Cells as we know 
them in our current biosphere are highly complex. Even the 
simplest, parasitic cellular forms involve hundreds of genes, 
complex molecular machineries of energy exchange and in- 
tricate membrane structures |Q]]. Such modern organisms are 
presumably far away from the initial simple forms of cellular 
life that inhabited our planet a long time ago, whose primi- 
tive early cousins we are now attempting to assemble in the 
laboratory. 

Several complementary designs of protocells have been 
proposed that differ in the actual coupling between their var- 
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ious internal components ITol [Til [l7l l25l l2fjll . One partic- 
ularly important problem here, beyond the specific physical 
and chemical difficulties associated with the assembly of these 
protocells, is the problem of modeling the coupling of the pos- 
sible kinetic and structural scenarios that lead to a full cell 
cycle. None of the current proposed designs has yet been for- 
mulated in a full mathematical model that in a 3D simula- 
tion is able to generate the possible outcomes of a successful 
coupling between the three prime components: the genes, the 
metabolism, and the container. We believe that a physically 
well-grounded modeling approach can provide critical insight 
into what can be expected from a coupled set of structures and 
reactions, how the nano-scale stochasticity can jeopardize ap- 
propriate molecular interactions or even what are the effects of 
molecular information carriers in helping accurate replication 
to occur. In this paper we present such a minimal 3D model 
that in connection with ongoing experimental efforts is aimed 
at assembling and understanding a new class of nanoscale- 
sized protocells: the so called Los Alamos Bug. 

In the Los Alamos bug, the container is built of amphiphilic 
surfactants. Due to a their interaction with water, the surfac- 
tants spontaneously self-assemble into micelles with the hy- 
drophobic end of the surfactant molecules in the interior of 
the micelles and their hydrophilic ends in contact with the sur- 
rounding water. The interactions between the micelle and the 
other components of the Los Alamos bug, namely the photo- 
sensitizer, the genome, and the container precursors, allow the 
micelles to host these other components. 
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FIG. 1: Schematic of the life cycle of the Los Alamos Bug: The 
system consists of surfactants, sensitizers, and a biopolymer that acts 
as a genome (1). The surfactants spontaneously self-assemble into 
a micellar container within which the sensitizer resides while the 
biopolymer sticks at the surface of the container — this forms a com- 
plete protocell (2). Resources (genomic oligomers, sensitizers and 
surfactant precursors in the form of esters) are added to the system 
and get incorporated into the container (3). The existing information 
carrier acts as a template for supplied oligomers to hybridize and ef- 
fectively replicate the genome. Light energy is used to convert the 
surfactant precursor and the oligomer precursors into actual surfac- 
tant, oligomers and waste. The container grows as new surfactants 
are produced (5). Once the container reaches a critical size, it be- 
comes unstable and divides into two daughter cells. This completes 
the life cycle of the protocell (6). 



The genomic biopolymer (possibly decorated with hy- 
drophobic anchors) is also an amphiphile and due to the spe- 
cific nature of its interactions with water and the micelle, it 
will tend to reside at the surface of the micelle (see figure 
Q]2). The sensitizer is a hydrophobic molecule and will there- 
fore reside in the interior of the micelle. Once self-assembled, 
the protocell aggregate is "fed" with precursor molecules for 
the surfactants (oily esters), sensitizers and genomic precursor 
oligomers. As surfactant precursors are hydrophobic they will 
agglomerate inside the proto-organism and form a hydropho- 
bic core (figure [TJ 3). Light energy is used by the metabolism 
to transform precursors into new building blocks (surfactants 
and oligomers) of the protocell. The genomic oligomers 
that are complementary with particular stretches of the tem- 
plate strand will hybridize with it (figure [TJ4). The fully hy- 
bridized template/oligomers complex, which now only has hy- 
drophobic elements exposed, will move into the interior of 
the container where polymerization of the oligomers occurs 
followed at some later time by a random dissociation of the 
fully polymerized double-stranded genome into two single- 
stranded templates that move back to the surface. This process 
could also be enhanced by a gentle temperature cycle near the 
gene duplex melting point. 

As surfactant precursors are digested, the core volume of 
the protocell decreases while, at the same time, new surfac- 



tants are produced. The resulting change in the surface to 
volume ratio causes the micelle to become unstable (figure 
[TJ5), until it finally splits into two daughter cells (figure 1.6). 
Assuming that components of the growing parent micelle are 
appropriately distributed upon division, the two daughter cells 
will be replicates of the original organism, thus completing the 
protocell cycle. 

In the above setup, the container, genome and metabolism 
are coupled in various ways. Obviously, both the replication 
of the container and replication of the genome depend on a 
functioning metabolism, as the latter provides building blocks 
for aggregate growth and reproduction. In addition to that, 
the container also has a catalytic influence on the replication 
of both the metabolic elements and the genome: the micel- 
lar structure provides a compartment which brings precursors, 
sensitizers and nucleic acids in close vicinity, thereby increas- 
ing local concentrations and thus metabolic turnover. Fur- 
thermore, the micellar interface catalyzes the hybridization of 
the informational polymer with its complementary oligomer. 
Once the hybridized complex enters the "water-poor/free" in- 
terior of a micelle, the thermodynamics should change suffi- 
ciently to allow a dehydration reaction to occur whereby the 
oligomers become polymerized. Alternatively the water-lipid 
interface could either itself act as a ligation catalyst or the ad- 
dition of simple amphiphilic catalysts could facilitate the gene 
polymerization process. Last, but not least, the nucleic acid 
catalyzes the metabolism, which otherwise is extremely slow. 
A summary of the subsystem coupling is shown in Fig. [2] 
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FIG. 2: Functional coupling between container, metabolism and 
genome. Note how the gene catalyses (dashed arrows) the metabolic 
production (solid arrows) of both gene and container building blocks. 
The container ensures a high local concentrations (proximity) and fa- 
cilitates thermodynamic reaction conditions (dotted arrows) of both 
the metabolic molecules and the amphiphilic replicator polymers. 
The free energy is provided by light (his) and the provided resources 
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II. THE MODEL 



Dissipative particle dynamic s (DPP) is a mesoscale sim u- 
lation method introduced by Hoogerbrugge and Koel manl in 
1 19921 The method has been improved as a result of vari- 
ous theoretical support, revision, and expanded capabilities 
10, EH O, and has been applied to a number of biological 
systems such as membranes lfl2i l29Tl . vesicles ||3~II l32tl . and 
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micelles din. Also chemical reactions have been incorpo- 
rated into the DPD method 10. HI]. In the context of protocells, 
DPD has recently been applied to study a self-replicating mi- 
cellar system [9]. The DPD formalism used in this work is the 
revised version from Groot and Warren [13] that has become 
the de facto standard of DPD. 



A. Dissipative particle dynamics 

A DPD simulation consists of a set of N particles located in 
three-dimensional continuous space with Euclidean metrics. 
These particles are not individual atoms but represent several 
water molecules or beads in a polymer chain. Each particle i 
has a position r,, mass to; and momentum q;, from which one 
can derive its velocity v.; = q;/mi. Its motion is determined 
by a force field F; through Newton's second law of motion: 



d 2 Vj 

dt 2 
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The force acting on particle i can be decomposed into pair- 
wise interactions, which respectively are the sum of three dif- 
ferent components — a conservative, a dissipative and a ran- 
dom one: 



where F c , F D and F R are defined by 



-T]co (ry) (ny • Vy)ny 



(2) 



(3) 
(4) 
(5) 



For each particle pair ry = — Tj is the relative po- 



sition, fj 



Iryl the center-to-center distance, and 



v* — Vj the relative velocity. We denote with ny — L tJ/ , tJ 
the (unit) direction between the two particles. A detailed dis- 
cussion of the different forces now follows: 

The conservative force F^ is expressed in the usual way as 



the negative gradient of a potential 



Vij = V(nj). In 



most DPD simulations, a pure repulsive soft core potential of 
the form 



Vij(r) 







if r < r c 
otherwise 



(6) 



is used for all particle interactions, ay and r c are constants 
that define the strength and range of the particle interaction. 
The magnitude of the resulting force decreases linearly from 
| Fy (0) | = ay to |Fy(r c )| = 0. The ay's depend on the type 
of interacting particles — and are therefore the appropriate lo- 
cation to parameterize the model. In addition, different parti- 
cles pairs could be given different values of r c if one wants to 
effectively give particles different radii. However, in the cur- 
rent work, we choose r c = 1 for all bead interactions, which 
is the standard in almost all DPD simulations. 



For the study of information polymers and amphiphiles, in- 
dividual DPD beads can be covalently bonded. A bond be- 
tween bead i and bead j is formalized by an additional har- 
monic potential 







?*b) if are bonded 
otherwise 



(7) 



with bond strength b and range r&. In addition to that, we in- 
troduce a bending potential to stiffen longer polymer strands: 
In a chain i — j — k of interconnected polymer beads, the angle 
6j formed by the two bonds of the central bead j induces an 
additional harmonic potential 

Vf jk (6 j )= 1 -c ijk (e j -9 eg ) 2 , (8) 

where eq is the equilibrium angle and cy-fc denotes the 
strength of the bending potential. 

The dissipative force F^ is a function of the relative veloc- 
ity of the two particles. It models the viscous damping of the 
fluid. The friction coefficient r\ in eq. scales the strength 
of this force and oj d is a distance weighting function not de- 
termined by the general formalism. 

The random force, Fj^ accounts for thermal effects. It is 
scaled by a strength parameter a and a second weighting func- 
tion w R . £y is a Gaussian distributed random variable with 
(&(*)) = 0, (tf)> = {SikSji + 5u5 jk )5{t - t') and 

€ij = Sit- 
in order to reproduce the right thermodynamic behavior, 

the DPD formalism must satisfy the fluctuation dissipation 
theorem. As a consequence, the equilibrium state will obey 
Maxwell-Boltzmann statistics and therefore allows the deriva- 
tion of thermodynamic properties. As shown by Espanol and 
Warren [7], DPD satisfies the fluctuation dissipation theorem 
if and only if the weighting functions u> D and u> R obey the 
relation 



LU° = {iU R f. 

In agreement with the DPD standard, we set 



1 2 
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(9) 



(10) 



D 



F R 



acts like a thermostat 



If relation © is fulfilled, F 

to regulate the temperature of the system and the equilibrium 
temperature k^T is given by 

2 

UT=°—. (11) 

where kb denotes the Boltzmann constant. In molecular dy- 
namics simulations, a variety of thermostats have been ex- 
plored, but only the DPD-thermostat is guaranteed to con- 
serve linear and angular momenta of the particles and thus 
flow properties of the fluid (because all involved forces are 
central: Fy = — Fji). It is therefore the only thermostat that 
allows the study of transport processes l28ll . 

In agreement with the DPD standard, we use r c and k^T as 
our units of length and energy. All particles have unit mass 
rrii — 1. From equation (Q~|) we can derive the unit of time as 
t = r c y/m/kbT. We will give an estimate of the order of 
magnitude of the physical length in section Hill 
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B. Incorporation of chemical reactions 

We extended the DPD formalism to account for chemical 
reactions. The way that chemical reactions are implemented 
in our model is taken from Ono 12311 . where Brownian Dy- 
namics is extended with the same algorithm. 

Chemical reactions in our system occur between two reac- 
tants and fall into two different classes: 

transformation: X — ► Y 
polymerization: X + Y — ► XY 

Each reaction has a given rate for spontaneous occurrence k s . 

The spontaneous reaction rate can be enhanced by the pres- 
ence of nearby catalysts. The catalytic effect decreases lin- 
early with increasing distance to the reactant up to a cutoff 
distance r cat beyond which it is zero. For simplicity, the ef- 
fect of several catalysts is modeled as a superposition. Thus, 
the overall reaction rate is given as 

k = k s +J2fcat(rc) (12) 
c 

with 

/cat = { ^ ^ ' ^ ifr0<r «* (13) 

{ else 

In these equations, the sum runs over all catalyst beads, with 
re denoting the distance to the first reactant, r cat the maximal 
catalytic range, and k cat is the catalytic rate. Polymerization 
has the further restriction that the distance between the reac- 
tants must be less than a maximal reaction range R. To deduce 
probabilities from the reaction rates, we used an agent-based 
like algorithm that is given in appendix lAl 

If a reaction occurs, we change the particle types of the 
reactants from X to Y and/or establish or remove a bond be- 
tween the reactants, depending on the type of reaction. Parti- 
cle positions and momenta are conserved. 

We also introduced particle exchange into the model to 
mimic the supply of chemicals into the system, which drive 
it out of its equilibrium. Within a given region, particles of 
a certain class can be exchanged with a given probability to 
drive certain processes. Note that total particle number is kept 
constant. Likewise in chemical reactions, we conserve posi- 
tions and momenta when exchanging particles. 



C. Components of the minimal protocell model 

We model the protocell with the following components: 
water, surfactant precursor, surfactant, sensitizer, information 
templates, and information oligomers and their precursors. 
Water (W) and sensitizer (Z) are single DPD particles. Sur- 
factants are modeled as amphiphilic dimers: one hydrophilic 
head (H) and one hydrophobic tail particle (T) connected by 



a covalent bond. Precursor surfactants are dimers of two hy- 
drophobic particles (T — T). Interaction parameters (as mul- 
tiples of fcbT) for the water and amphiphiles have been taken 
from lITTll (where surfactants are modeled as dimers as well): 
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Bond parameters are b = 150fc;,T and r& = 0.5r c . These 
parameter values were originally used to analyze polymer sur- 
factant interactions. Later, the phase diagram for varying sur- 
factant concentrations was analyzed II 3 311 . 

In order to keep the number of different parameters as low 
as possible, we express further interactions with the same pa- 
rameters as the ones above: sensitizer beads are hydrophobic. 
Thus, their interaction parameters are equal to those for sur- 
factant tails: azj = a<Tj. 

Genes 

The gene is modeled as a strand of covalently bound 
monomers (A and B) with hydrophobic anchors (T) attached 
to it. We assume the gene is similar to a peptide nucleic acid 
(PNA) decorated with lipophilic side chains to the backbone. 
The reason why we are utilizing PNA and not DNA or RNA is 
because we want to have a non-charged backbone for the gene 
molecule to enhance its lipophilic properties. For details, see 
j26ll . We note that the use of PNA decorated with lipophilic 
side chains in conjunction with an amphiphilic surface layer 
will cause the genetic material to have a behavior that is quite 
different from that of DNA or RNA in water. In particular, it 
is not at all clear that the two complementary macromolecules 
locally will lie in a common plane when hybridized with each 
other. Thus we investigated a number of possible different 
orientations. 

By numbering the monomers within each strand, we in- 
troduce an orientation of the molecule that mimics the ori- 
entation of the actual peptide bond given by its C- and N- 
termini. This allows us to define the following vectors for 
each gene monomer bead: U; is a unit vector pointing from 
the previous monomer towards the current one. For the first 
monomer in the strand U; = 0. Likewise, v s ; is a unit vector 
pointing towards the next monomer in the strand (or for the 
last monomer). Zj is a unit vector pointing from the actual 
monomer towards its anchor bead. To obtain the association 
of PNA to the micellar surface, the molecule is modeled as in- 
terconnected amphiphiles. For the hydrophobic anchors, we 
use the same bead type T as used for the surfactants and pre- 
cursors, while nucleotide beads share the interaction param- 
eters of the hydrophiles: a\j = a^j = any- We need to 
introduce additional interactions that describe the affinity of 
complementary gene monomers. Due to the rather complex 
combination of hydrogen bond formation and cooperative and 
ir stacking between real gene monomers, we cannot expect the 
complementary monomer bead forces to be as simple as the 
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bead-bead interactions introduced in the former section. We 
now implement and test several alternative representations of 
such base affinities as discussed below. 

a. undirected attraction: The obvious extension of F£ 
to include attractive interactions is a combination of attractive 
and repulsive components. Thus, in the first representation, 
we replace F^ B (r) by the stepwise linear function 



F£ 1 B (r)=FX B (r) 



a 2 (r c 







r) n if r < r C2 
else 



(14) 



with r C2 > r c and a 2 < 0. Different attraction strengths a 2 
will be used and compared in later computer simulations (sec- 
tion llll PTt . To compensate strong attractions for small values 
of r, we will vary the repulsion strength a% = oab accord- 
ingly. Note that another generalization of F^ B compared to 
F^ B is the change in the interaction range which, in addition 
to the standard r c dependence, now also depends on the actual 
pair (A, B) through r C2 . 

b. directed "radial " attraction: In the real gene system, 
hybridization is partly due to the formation of H-bonds be- 
tween the complementary nucleotides. H-bonds share fea- 
tures with covalent bonds, which are better characterized by 
directed rather than radial interactions. Hence, in the second 
representation, we introduce directed attraction parallel to the 
A - T and B - T axes, respectively. Here, we replace F^ B 
by 



AB 



(r) 



a 2 (r C2 - r) (z • r)n if r < r C2 
[ else 

(15) 

with the above definitions for r, z, and n. Again, different 
attraction coefficients a 2 will be compared in the later simu- 
lations. The value ax = sab, on the other hand, can be held 
fixed because the attraction vanishes when r approaches 0. We 
set a\ = 35fcf,T = oaa = abb We call this interaction "ra- 
dial", because the strongest attraction will be radial towards 
the center of the micelle, once the PNA strand is attached to 
the surface of the micelle. 

c. directed "tangential " attraction: The third represen- 
tation is similar to the second, except that attraction is now 
perpendicular to the backbone and to the AT (or BT) axis. 
The force is attractive towards one side of the PNA and repul- 
sive towards the other — hence, it is the only implementation 
that catches the directionality of the molecule: 



*l 3 B (r) 



AB 



{ \ ( (u-\-v 



xz 

) xz 



r J n if r < 

else 
(16) 

This force is expected to be strongest tangential to the surface 
of the micelle. As in the last case, we will vary a 2 , but keep 
ax fixed at a value of 35/c;,T. 

Covalent bonds within PNA strands have a bond strength 
of b = 150fcf,T with an ideal bond length r& = 0.5r c for 
bonds between nucleotides and anchors, and Tf, = 0.75r c for 
bonds between the nucleotides themselves. In addition, we 
introduce stiffness (eq. [8) within the PNA strand: angles of 





a) radial attraction 



b) tangential attraction 



FIG. 3: Hybridization complexes for radial (a) and tangential (b) 
attraction between complementary bases. Bases are shown as black 
and white beads, hydrophobic anchors in yellow. Arrows denote the 
direction of strongest attraction. 



interconnected strands prefer to be stretched out (8q = 180°, 
= lOkbT). With the stiffness we model folding restric- 
tions of the peptide bond, as well as 7r-7r electron stacking of 
nearby nucleotides. This affects only the PNA chain, not the 
bonded hydrophobic anchors, as they do not experience any 
bending potential. Table Q] summarizes the chosen set of pa- 
rameters. 
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TABLE I: Interaction strength a.y (as multiples of ktT) for all bead 
types defined in the model. The force (*) between complementary 
nucleotides A and B has attractive parts and cannot be expressed by 
a single interaction parameter aab • Three different force fields have 
been considered for such interactions. See the text for details. 



Reactions 

For the above listed components we introduce the following 
chemical reactions: 

First, we define a reaction that transforms the oil-like pre- 
cursor surfactants into actual surfactants. In the real chemi- 
cal implementation of the protocell, the precursors are fatty 
rc scid esters. The ester bond of the precursor surfactant breaks 
thereby producing a fatty acid — the surfactant — and some 
small aromatic molecule — which is considered waste. Dis- 
regarding the production of the waste, we model this reaction 
by the scheme 



TT 



HT + Z 



(17) 



which reflects, that both parts of the ester are hydrophobic, 
while the resulting surfactant is an amphiphile. For simplic- 
ity, the spontaneous reaction rate is set to 0t _1 . The sensi- 
tizer acts as a catalyst with a catalytic radius of 1.0r c . In our 
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simulation, the catalytic rate of the sensitizer can be turned 
on (k cat = l.Or" 1 ) and off (Or -1 ) interactively by a switch. 
This mimics the photo-activity of the sensitizer. 

Second, we introduce reactions to form covalent bonds be- 
tween the terminal monomers of pairs of oligomers. 

A + B ► AB 

A + A — ► AA (18) 
B + B ► BB 

These syntheses are only applied to the terminal monomers 
in the PNA strands and involve no catalysts. The maximal 
range is 0.75r c , the maximal reaction rate is k max = O.lr -1 . 
The actual reaction rate between monomers i and j further 
depends on the orientation of the ligating strands: we set 

% = \k max • ^ + l) (19) 

This formulation also prevents covalent bonds between com- 
plementary strands (which are anti-parallel, and thus, have an 
effective k close to zero). 

III. RESULTS 



average of this result over the number of time steps was than 
histogrammed. We also observe a continuous exchange of sur- 
factants with the bulk phase. As a result of these associations 
and dissociations, we find a number of free monomers and 
sub-micellar aggregates in the bulk phase. These observations 
qualitatively fit theoretical and experimental results [see e.g. 
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We use the model discussed above to study various aspects 
of the life cycle of the Los Alamos Bug as depicted in fig- 
ure Q] In particular, our simulations address the spontaneous 
self-assembly of protocells (Fig. Q] frames 1&2), the incor- 
poration of resources (frames 2&3), the metabolic growth of 
the protocell (frames 4&5), template reproduction, and finally 
fission into two daughter cells (frames 5&6). We will further 
analyze some of the catalytic coupling processes explained in 
the introduction. 

All simulations are performed in three-dimensional space 
with periodic boundaries. We set a to 3 and r\ to 4.5, which 
leads to an equilibrium temperature of Ik^T. A total bead 
density p = 3. Or -3 is used for all simulations. System size 
and number of iterations is noted for each individual simula- 
tion run. We integrate equation ([U numerically with the DPD 
variant of the leapfrog Verlet integrator discussed in 1 1311 with 
A = 0.5 and a numerical step width of At = 0.04r. 



A. Self-assembly of micelles 

We initialize a cubic box of size (12.5r c ) 3 randomly with 
2.9 water beads and 0.05 surfactant dimers per unit volume, 
or 5664 water beads and 98 dimers in the box. Simulations are 
performed for Or < t < lOOOr with the interaction parame- 
ters summarized in Table U and the model parameters given in 
the introduction to this section. We observe the formation of 
spherical micelles with aggregation numbers up to about 20, 
with a peak around 12. This is shown in figure|4j where once 
the system had reached an equilibrium state, we followed its 
behavior. For each time step we recorded the number of ag- 
gregates of a particular aggregation number and hence the to- 
tal number of surfactants in the aggregates of that size. The 



FIG. 4: Micellar size distribution for a system containing 2.9 water 
beads and 0.05 surfactant dimers per unit cube. To obtain the ag- 
gregate size histograms from a system state, every two surfactants 
whose T-beads are separated by less than r c are considered to be- 
long to the same aggregate. 20000 systems states of an equilibrated 
system (200r < t < lOOOr) are averaged in the shown distribution. 

Although we do not intend to model specific chemicals, we 
can roughly estimate the order of magnitude for the physical 
length scale of our simulation, using a procedure proposed by 
Groot and Rabone 11211 . Our calculation is based on sodium 
alkanesulfates as these are well studied surfactants with prop- 
erties similar to the fatty acids used in the real chemical im- 
plementation. Table [TT] lists the critical micelle concentra- 
tion (CMC), i.e. the minimal concentration at which micelles 
spontaneously form. The table also gives the mean aggrega- 
tion number and the volume of these molecules. Under the 
simplifying assumption that all DPD beads have equal effec- 
tive volume, we can derive the molecular volume of a sin- 
gle DPD bead and - knowing the molecular volume of water 
(Vh 2 o = 30A ) - we get the so-called coarse graining param- 
eter 



N„ 



surf 



(20) 



H 2 



that tells us, how many water molecules are represented by 
a single DPD bead. The average number of DPD water 
beads per unit cube is p, each one of them representing N m 
molecules. Therefore, the physical length scale r c resolves to 



r c = (pN m Vn 2 o) 



1/3 



(21) 



We will work with solutions that are quite dilute and hence 
dominated by water. Noting that a liter of water has 
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surfactant 


CMC 


aggregation 


surfactant vol. 


N m 


r c 


surfactant cone. 


predicted 




in mol/1 


number 


in A' 




in A 


in mol/1 


micellization ratio 


NaCaHnSOi 


0.42 


17 ±6 


278 


4.625 


7.467 


0.201 


1 ■ 10" 5 


NaC 7 H 15 SOi 


0.22 


22 ± 10 


305 


5.075 


7.701 


0.183 


2.5 • 10" 3 


NaCsHnSOi 


0.13 


27 


332 


5.525 


7.923 


0.168 


0.2 


NaC 9 HwS04 


6.0 ■ 10~ 2 


33 


359 


5.975 


8.132 


0.156 


0.6 


NaCnH 23 S0 4 


1.6 ■ 10~ 2 


52 


413 


6.875 


8.521 


0.135 


0.935 


NaC 12 H2sSOi 


8.2 • 10~ 3 


64 ± 13 


440 


7.325 


8.703 


0.127 


1 


NaC 14 H29S0 4 


2.1 • 10" 3 


80 ± 16.5 


494 


8.225 


9.046 


0.113 


1 



TABLE II: Data for sodium alkanesulfate surfactants with varying tail length. For each surfactant, CMC and mean aggregation number are 
listed from 1 2]. The molecular volume is estimated from the numbern of carbon atoms using the formula V = 27(n + l)A 0] plus a constant 
88.5lA for the sulfate group (whose value is derived from the molecular mass (98.08g/mol) and density (1.84t;/cm 3 ) of sulfuric acid). The 
coarse graining parameter N m , the physical length scale r c , and the total surfactant concentration are the interpretation of model parameters 
in case that the model dimer represents the respective surfactant. Finally, the fraction of micellized surfactant is the prediction of the closed 
association model for the respective surfactant and the calculated concentration 1811 . 



1000/18 = 55.56 moles of water in it, while a volume of rj? 
has pN m molecules of water in it, we find that a concentration 
of 1 particle/rj? yields a unit of concentration as 



lr~ 3 = 55.56mol/pN m . 



(22) 



With these estimations, we find that the lipid concentra- 
tion in the above simulation represents between 0.11 and 0.20 
mol/l. It is somewhat arguable to estimate the concentration 
of free lipids in the bulk phase, because our simulations do 
not yield a sharp distinction between free lipids — i.e. sub- 
micellar aggregates — and proper micelles. Assuming that the 
most reasonable choice for such a distinction is the first min- 
imum in the micellar size distribution at aggregates of size 
5 or less, from figure [4] we get an average of 22.9 free sur- 
factants in the bulk phase out of 98 lipids in the total vol- 
ume, i.e. 76.6% of the surfactant is micellized and the free 
lipid concentration lies between 0.03 and 0.05 mol/l. Know- 
ing the physical surfactant concentration, we can compare 
this finding to the prediction of the closed association model 
JH]. According to this model, surfactants are either in bulk 
phase (S) or in micelles of aggregation number N (Sn)- With 
the pseudo-chemical reaction NS f± Sn and the condition 



that 



d[S] 



d[SU 



dN[Sn 



CMC 



d[SU 



CMC 



0.5, one can calculate 

the fraction of micellized surfactant for any total surfactant 
concentration [S'Jtotai = [S] + N[Sn}- The respective ratio 
A r [S' A r]/[S , ] total is also given in table HT1 

We find that our model best matches the aggregation num- 
bers of short chain surfactants (NaCeHi^SO^, while our 
micellization ratios more closely match the predictions for 
the somewhat longer chains (NaCgHigSOi). Although our 
model representation of surfactants as dimers is rather sim- 
plistic, we find a reasonable match (at least in the order of 
magnitude) between experiment, simulation, and theory. It 
should be noticed that the micellization parameters for fatty 
acids, which are the container surfactants of choice in the 
Los Alamos Bug, are qualitatively similar to the listed sodium 
alkanesulfate surfactant parameters, which are the most well 
studied surfactants in the scientific community. Given the easy 
availability of relevant parameters for alkanesulfate surfactant 



parameters and the level of coarse graining in our DPD model 
we can safely use these experimental data to calibrate our sim- 
ulation. It is conceivable that closer matches might be found 
by changing interaction parameters or the representation of 
surfactants. We have however decided to stick to the standard 
parameter set in order to get comparable results to earlier DPD 
simulations 

Next, we analyzed a ternary mixture of water, surfactant, 
and oil. In the system described above, we exchanged an ad- 
ditional 0.1 water beads per unit volume by 0.05 hydrophobic 
oil dimers (T — T), which represent the lipid precursors of the 
Los Alamos Bug. Starting from a random initial condition, the 
system forms loaded micelles: the precursors aggregate into a 
core in the interior of the individual micelles because of their 
high degree of hydrophobicity. This core is coated by surfac- 
tants, which shield it from water. We observe a stabilizing 
effect from the hydrophobic core: the rate of monomer dis- 
sociation from the aggregates decreases by a factor of 4 to 5. 
Dissociation of oil dimers does not happen during the simula- 
tions. Over the simulated time span (Or < t < lOOOr), these 
loaded micelles constantly fused to form bigger aggregates. 
At t = 250t, the system is composed of five micelles with 
aggregation numbers 12, 13, 16, 24, and 32, where the aggre- 
gation number just counts the surfactants in an aggregate and 
not any of the precursors or other components. At t = 500r 
we find four micelles (with sizes 16, 24, 25, 32) and finally, 
for t = lOOOr, the system consisted of only two micelles with 
aggregation numbers 43 and 53. It remains unclear, whether 
this was the equilibrium solution, or whether the two micelles 
would finally fuse to form a single aggregate. It is known that 
any given mixture of surfactants and oil in water results in 
some equilibrium aggregate structure, some useful and some 
less useful as a protocellular container substrate, see e.g. the 
recent summary discussion in J2lll . 

In general, the addition of hydrophobic precursors allows 
aggregates to grow far beyond their micellar aggregation num- 
ber, while at the same time, monomer dissociations from the 
assembly falls by a factor of four or more. This is consis- 
tent with simulation results from earlier studies of a similar 
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surfactant-precursor- water system 01 . However, a more sys- 
tematic DPD investigation is necessary to address the dynam- 
ics, stability, and size distribution issue in this context. 



B. Self-assembly of the protocell 

In this section, we study the self-assembly of protocells. We 
initialize a cubic box of size (7.5r c ) 3 with 1212 water parti- 
cles, 21 surfactant dimers, 4 sensitizer particles and one PNA 
strand that is four nucleotides in length. All other simulation 
parameters are as before. Using these numbers, we achieve 
the same overall particle density and the same surfactant con- 
centration as in the previous section. 

Starting from an arbitrary initial condition, we observe the 
spontaneous formation of a protocell, i.e. a micelle that is 
loaded with sensitizer and which has PNA attached to its sur- 
face and whose nucleotides are exposed to the aqueous phase 
(see figure[5]). Aggregation happens within a remarkably short 
period: after only 10 time units, we already find complete pro- 
tocells. The lipid aggregation number of this micelle is around 
14 with few associations and dissociations of monomers. The 
slight increase in aggregation number along with a decrease of 
monomer dissociations is most probably due to the stabilizing 
effect of the additional sensitizers. 



C. Replication of the Container 

The dynamics of a surfactant-precursor-water system sim- 
ilar to the one under consideration has been studied in detail 
in 0. Considering precursor and surfactant kinetics, the for- 
merly analyzed system differs from the one discussed here 
in that i) the catalytic role of sensitizers is performed by the 
surfactants themselves, and ii) the metabolic turnover is not 
regulated by turning the light on and off, but instead only 
follows chemical mass kinetic s. Using simul ati ons ba sed on 
classical lattice gas methods, ICovenev et al.1 in [l996 repro- 
duced the micellar sel f-replication experiments of Bachmann 
et al. JH. In ll998l and 2000 Mayer and Rasmussenl developed 
an extended lattice polymer approach for explicitly including 
polymers and chemical reactions similar to the current DPD 
approach and they were also able to reproduce the experimen- 




FIG. 5: Self-assembly of the protocell from a random initial condi- 
tion. The diagrams show the state of the system at times a) t = Or, 
b) t — 4t, and c) t — lOr. Surfactants are shown in green (head 
bead) and yellow (tail), the sensitizers in red, the PNA backbone in 
yellow and the PNA monomers in black and white. 



tal findings by Luisi's group [3]. The purpose of this section is 
to show that the reported dynamics also hold for the metabolic 
reaction scheme of the Los Alamos Bug. 

A system of size (10r c ) 3 is initialized with a micelle con- 
sisting of 15 surfactants and loaded with 4 sensitizer beads in 
its interior. Model parameters are given in the beginning of 
this section. In a single spherical region of radius 2r c located 
away from the micelle, pairs of water particles are replaced 
by surfactant dimer precursors with an overall exchange rate 
of w 2.5 x 10~ 3 precursors per time unit. 

Because of their hydrophobic nature, the precursor 
molecules tend to agglomerate into oil-like droplets. The dif- 
fusion of such droplets becomes progressively slower the big- 
ger they are. This initiates a positive feedback: the bigger 
the droplets, the more slowly they diffuse out of the exchange 
region. The slower they diffuse, the more likely they are to ac- 
cumulate additional precursors before they diffuse out of the 
exchange volume. By varying the volume of the exchange re- 
gion and/or the rate of exchange, one can set the mean size 
of the precursor droplets that are formed. Due to the positive 
feedback, the effect will not be linear with either the exchange 
region size or the exchange rate. 

Since we do not want the non-continuous exchange events 
to disturb the systems dynamics too much, we restrict particle 
exchange to a region of 2.07- c (3% of the total system volume). 
By varying the exchange rate used to introduce precursors, we 
find that 5.0 x 10~ 5 is close to the optimum for which droplets 
of precursor molecules are provided at a reasonable rate, yet 
are still small enough to diffuse at a reasonable speed. With 
these values, the precursor droplets consisted of 5 dimers on 
average. Once in the vicinity of a micelle, the droplets are 
immediately absorbed. 

When the micelle absorbs 15 precursor molecules into its 
interior, we stop supplying additional precursors and trig- 
ger the catalytic activity of the sensitizer by turning on the 
light. During the metabolic turnover, the micelle grows in am- 
phiphile number, while losing few, if any, amphiphiles due to 
the stabilizing effect of the remaining precursors as was dis- 
cussed previously. It responds to the changing surfactant to 
precursor ratio by changing its shape from spherical to rod- 
like. The elongation continues until nearly all the precursors 
are metabolized. At some moment, the elongated aggregate 
becomes unstable and divides into two daughter cells (see fig- 
ure [6}. With the parameters used, overall precursor turnover 
and fission takes place in approximately 20 time units (i.e., 
500 time steps). 

We compared the above findings to simulations of an unreg- 
ulated system, where the precursor supply and catalytic rate 
are not triggered, but instead held constant over the whole 
simulated time span. The objective behind this simulation 
was to find whether the system might feature inherent self- 
regulation: as the precursor forms droplets in the bulk phase, 
their incorporation into the micelle occurs in spurts rather than 
continuously. If the introduction rate of precursors into the 
system is locally fast enough to allow larger droplets to form 
(especially due to the positive feedback effect), a larger num- 
ber of precursors can simultaneously enter the protocell. Then 
if the metabolic turnover rate is sufficiently fast, the turnover 
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FIG. 6: Replication dynamics of the container: precursors are fed 
into the system far from the micelle at the (periodically reflected) 
edge of the system space. They form droplets in the aqueous phase, 
which are absorbed by the protocells as a whole. Protocells grow by 
incorporation of precursors. After a critical amount of precursor is 
transformed into surfactant, the assembly loses its stability and splits 
in two daughter cells (right frame). 



of the large number of precursors might be sufficient to trigger 
container division rather than having a slow but continual loss 
of newly formed amphiphiles. 

To investigate this possibility, we performed simulation 
runs for a system of size (10r c ) 3 initialized with a micelle of 
15 surfactants and 4 sensitizer beads. Other model parameters 
are the same as given in the beginning of this section. Precur- 
sors were supplied by the same mechanism and rate as before. 
We observed the incorporation of droplets between 3 and 9 
precursor dimers in size. As the transformation of precursors 
happened significantly faster than the precursor supply, nearly 
each droplet was transformed separately. When only few pre- 
cursors were absorbed at once (i.e. a small droplet), the mi- 
celle responded by rejecting several surfactants into the bulk 
phase. Such loose surfactants then formed sub-micellar ag- 
gregates or attached to precursor droplets when present. How- 
ever, when the incorporated droplet was big enough, the out- 
come of the metabolic turnover was a proper cell division. A 
micelle that consisted of 15 surfactants and 4 sensitizers, for 
example, split in two after the absorption and turnover of 8 
precursors. The fission products were two micelles, one with 
14 surfactants and 3 sensitizers and the other with 9 surfac- 
tants and 1 sensitizer. 

This result suggests that the explicit regulation of the 
metabolic turnover by light bursts might not be necessary to 
obtain the replication cycle of the container as a similar regu- 
lation can be obtained by a careful regulation of the provided 
precursor droplet sizes. Light control might, however, still 
serve as a convenient mechanism to synchronize container and 
genome replication if they occur on separate time scales. 



D. Replication of the genome 

In our experience, the most difficult component of the pro- 
tocell to model with DPD methods is the genome and its be- 
havior. Furthermore, the DPD hybridization process seems 
more illdefined than the ligation process, which is why our 
discussion of the replication of the genome is divided in 
two consecutive steps: hybridization and ligation. Please re- 
call that hybridization denotes the alignment of short PNA 
oligomer sections along the template PNA strand and "hydro- 



gen" bonding to it, while ligation — or polymerization — is the 
reaction that turns aligned oligomers into an actual (comple- 
mentary) copy of the template. 



1. Hybridization 

Replication of the genome essentially depends on the stabil- 
ity of the hybridized complex: it can only occur if the double 
strands are stable for a time long enough for all the needed 
oligomers to diffuse to and align with the template. It should 
be noted that if more than 2 oligomers are involved, the join- 
ing of additional oligomers and their polymerization can oc- 
cur sequentially so the unpolymerized templates need not all 
be simultaneously attached. As will be shown further below, 
once some polymerization has occurred, that section will be 
more stable in hybridized form. We studied the stability of 
the hybridization with the following simulation: A system of 
size (5.5r c ) 3 was initialized with an oil layer that is meant to 
mimic a two phase system (single beads of type T are con- 
fined to lie below a plane above which the water is located). 
The overall particle density is p = 3r~ 3 , as in the earlier 
experiments, in order to make the hybridization process as 
simple as possible. As we shall see later, aggregate surfac- 
tant dimers tend to tangle with the gene anchors, which both 
slows down the hybridization process and makes it less accu- 
rate. A four-monomer long PNA template was placed at the 
oil-water-interface with its anchors pointing down toward the 
oil and its bases pointing up towards the aqueous phase. A pair 
of 2-nucleotide long complementary oligomers was placed at 
a distance of 0.5r c from this strand at a location/orientation 
for proper hybridization. The location/orientation was varied 
to match the different hybridization cases studied. In the case 
of directed radial attraction, this meant that all the beads of 
the complementary PNA molecules are outside the interface 
plane, with their hydrophobic anchors pointing away from the 
hybridization site. In contrast, in the case of tangential attrac- 
tion, both the template and the oligomers span the interface 
region as shown in figure [7] 

In the system modeled, we only had two different types of 
monomers (A, B) with A and B being complementary to 
each other, but not self-complementary. All different 4-mer 
templates excluding symmetric configurations were used (e.g. 
AAAA, AAAB, AABB, ABAB, and ABBA) and for 
each different template only the proper complementary dimer 
oligomers were used. The different 4-mer configurations can 
differentially hinder the ability of the complementary bases to 
slide along the template. 

During the simulations, the distances between all four com- 
plementary base pairs were measured at every time step. 
When one of these distances exceeded 1.5r c (the maximal in- 
teraction range for complementary bases), the PNA strands 
were considered to be dehybridized. The time it took for the 
double strands to dehybridize — i.e. the association time of 
the hybridized complex — serves as a measure of the stability 
of that state. After a maximum of t = lOOr, simulations were 
truncated and the hybridization was considered to be stable. 

For the three different representations of PNA hybridization 
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undirected attraction 



FIG. 7: Initial setup of the hybridization simulations. The system 
is initialized with an oil-layer that mimics the oil-water interface of 
a two-phase system. A four-mer template and two complementary 
dimers are placed at the interface so that they form a hybridization 
complex. The association time of such hybridization complexes is 
measured for different PNA implementations and attraction forces. 
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(see sections lll C\ Genes, cases a,b, and c), we performed sim- 
ulations for all possible combinations of four bases exclud- 
ing symmetrical combinations. Strengths for attractive forces 
were set with respect to the repulsive force parameter oab 
so that complementary bases attracted each other but did not 
overlap by more than 0.6r c . The association times were mea- 
sured using 10 to 20 runs for each combination. Results are 
shown in figure [8] 

a. undirected attraction: In the case of undirected at- 
traction, we found mean association times between 2.12r for 
01 = 50k b T, a 2 = -lOfaT, and 7.76r for a x = 65k b T, a 2 = 
— 20fcf,T. For strong attractions, association times tended to 
increase with the number of equal (preferably nearby) nu- 
cleotides in the template (AAAA is the most, while ABBA 
is the least stable sequence). However, these differences were 
rather small. 

b. directed radial attraction: For directed radial attrac- 
tion, the mean association times ranged from 0.45r for a 2 — 
-10k b T to 0.98t for a 2 = ~30k b T (oi = a AB = 35fc b T 
for all cases) without any significant variation for different se- 
quences. For most simulation runs, it took only a few time 
steps for the initial complex to dehybridize. The reason for 
the poor nature of the hybridization of the PNA for the radial 
attraction is quite obvious: due to the amphiphilic character 
of PNA, the strands will arrange so that nucleotides point to- 
wards water and the anchors towards oil. Thus, the attraction 
is directed perpendicular to the oil-water interface and into 
the aqueous phase where the oligomers do not want to reside. 
Because of the dot product in equation ( TT3T >, the attraction be- 
tween two PNA molecules on the interface is marginal and the 
association time is essentially a matter of diffusion. 

c. directed tangential attraction: In contrast to the other 
tested situations, in the case of directed tangential attraction, 
one can see significant differences in the association time of 
the initial hybridized complexes, provided the attraction is 
strong enough: for gene sequences with pairs of equal bases at 
terminal positions (e.g. AAAA and AABB), hybridization 
is usually less stable than for sequences without equal bases 
at terminal positions (ABBA and ABAB). The association 
time of sequences with only one such dimer lies between the 
values of the above two situations. Examination of the sim- 
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FIG. 8: The association times (i.e. the time until the initially hy- 
bridized complex becomes dehybridized) for different PNA template 
sequences of length four using a) undirected, b) radial, and c) tan- 
gential attraction. For each implementation, three different attraction 
strengths are compared, as given in the legend for each figure. a± 
denotes the coefficient of the repulsive part, £12 the coefficient of the 
attractive part of the interaction force. In the case of directed attrac- 
tion (b and c) a± was set to 35fcj,T independent of the respective 
value of a,2- In c), the plotted averages are minimal values for the 
actual averages, as simulations were truncated at t = 100r. If runs 
were truncated, the multipliers above that run designate how often 
this was done. 



ulations reveal the cause of this trend: a continuous group of 
two or more equal monomers, one of which is a terminal po- 
sition of the template allows the attached dimer to slide along 
the template strand without a strong penalty in potential en- 
ergy, and eventually protrude beyond the end of the template. 
In this misaligned configuration, the dimer can easily distort 
from the parallel alignment, thereby reducing the overall at- 
traction to the template, until it finally disassociates from the 
complex. Distinct bases at terminal positions, on the contrary, 
prevent this sliding along and then off of the template, thereby 
significantly stabilizing the hybridized state. 

For the more promising PNA implementations — undirected 
and tangential attraction — we further measured the mean dis- 
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tance between complementary bases (hybridization distance) 
and the distance between those bases in the oligomers that 
are supposed to polymerize (ligation distance). We performed 
these measurements using the sequence AAAA for the undi- 
rected, and ABBA for the tangential attractions (interaction 
parameters are given in the caption of figure |9}. Simulations 
are performed for Or < t < lOOOr. The resulting time series 
are shown in figure [9] 

In the case of tangential hybridization one finds two alter- 
nating domains in the hybridization distance time series: (i) 
when oligomers are aligned to the template, the mean hy- 
bridization distance is around 1.04r c with only small fluctua- 
tions and an average ligation distance of 1.01r c (e.g. 430r < 
t < 450r and 700r < t < 780t in figure 0. In between 
such periods, (ii) oligomers dissociate from the template, and 
diffuse over the interface, which is indicated by the large vari- 
ance in hybridization distance. 

Undirected attraction, in contrast, yields hybridization dis- 
tances around 1.07r c with significant continual fluctuations 
and a mean ligation distance of 1.158r c . One cannot observe 
the "locking" of the hybridized state that is apparent for the 
tangential attraction: although the oligomers preferably stay 
in the vicinity of the template, they are not forced into any 
particular orientation. Investigation of simulation states re- 
veals that oligomers align along different sites of the template 
or even cross the template strand. Thus, although it appears 
from a quick look at figure [9] that the undirected attraction 
performs better on average, it is only during the "locked in" 
period that the desired reactions occur. We can therefore con- 
clude that only the implementation of PNA using tangential 
attraction is able to generate a proper hybridization and base 
recognition approximation. 
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FIG. 9: Mean hybridization (upper panel) and ligation distance 
(lower panel) for the PNA templates (and corresponding oligomers) 
AAAA using undirected attraction with ai = 65fc;,T, = 
—20kbT (red) and for ABBA using tangential attraction with ai = 
35k v T,ci2 — AQkbT (blue). By hybridization distance, we mean 
the average distance between complementary nucleotides, by liga- 
tion distance, we mean the minimal distance between two terminal 
nucleotides that are supposed to polymerize. The maximal values of 
the various distances are limited by the small size of the box. 



It is assumed that the PNA replication is catalyzed by the 
oil-water or surfactant-water interface. This is because: (i) 
lipophilic PNA concentrates at the oil-water interface and thus 
obtains a much higher local concentration there than in wa- 
ter; (ii) that the interface contains a lower water concentration 
than the bulk phase; (iii) that the interface might directly act 
as a catalyst for the amide bond formation; and (iv) that the 
PNA is more spread out (linear) when attached to the inter- 
face. To test the geometric part of this hypothesis, we also 
performed simulations of hybridization in pure water. We ran- 
domly initialized a box of size (5.5r c ) 3 with water, PNA tem- 
plate (ABBA) and complementary oligomers using directed 
tangential forces (the overall bead density was p — 3r~ 3 ). 
Simulations were performed for Or < t < 1000t. Hy- 
bridization and ligation distances are plotted in figure[T0] The 
mean hybridization distance in this scenario is 1.41r c (which 
is close to the maximum radius r C2 at which attraction of com- 
plementary nucleotides still exists) with a standard deviation 
of 0.34r c . Moreover, there is no clear separation between hy- 
bridized and dehybridized states. In contrast to the scenario 
for the oil-water interface, the oligomers never completely 
dissociate from the template. However, the oligomers are not 
properly hybridized either. Instead, the template and comple- 
mentary strands mainly attract each other due to the hydropho- 
bic interactions between the tail beads of these strands rather 
than forces between their bases. Inspection of the simulated 
states shows that oligomers are seldom aligned parallel to the 
template. The overall structure has more resemblance to that 
of a micelle with geometries defined by the amphiphilic prop- 
erties of the molecules, rather than a double strand defined by 
base affinities. The ligation distance has an average value of 
1.12r c with a standard deviation of 0.39r c . Unfortunately, this 
is smaller than in the previous simulations. This might result 
in ligation rates higher than those on the surface. However, 
if we decide to vary the ligation probability depending on the 
angle between PNA backbones, the effective ligation rate is 
smaller than at the oil-water interface. 

Last but not least, it is notable that we cannot achieve re- 
liable hybridization without a stiffness potential in the PNA 
chain. In the absence of such stiffness, complementary bases 
within one strand tend to bind to each other and form sharp 
hairpin loops even for very short strands. This effectively hin- 
ders any proper hybridization except for very few sequences 
that do not offer any possibility for loop formation (such as 
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FIG. 10: Hybridization and ligation distances of PNA template and 
complementary oligomers in water. For PNA, tangential directed at- 
traction with a,2 — 40fcbT has been used. The nucleotide sequence 
is ABBA. 
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2. Ligation 

To study the polymerization reaction, a four-mer template 
strand and two complementary dimers are placed randomly on 
the surface of a loaded micelle (20 surfactant, 20 precursors) 
within a system of size (10rj?) and total density p = 3.0r~ 3 . 
As the last section identified ABBA to form the most sta- 
ble hybridization complex, we restrict polymerization exper- 
iments to this particular sequence using the PNA representa- 
tion with tangential directed attraction (see figure fTTb. 

Of the performed simulations, 8 out of 10 generated proper 
template directed ligation, while the remaining 2 reactions oc- 
cur spontaneously in the absence of the template strand and 
define the expected background reaction |[22ll . In our simula- 
tions, one of the two spontaneous ligation results was a correct 
complementary copy of the template strand while the other 
was not. Note that in our simulation, polymerization has not 
been explicitly restricted to happen only between C- and N- 
terminals, which means that both ends can be concatenated 
with any other end. When ligation is template directed, 6 
out of 8 runs lead to correct complementary sequences, while 
the other two resulted in mispairings of the form BABA. In 
summary, we find that correct replication is about 50% more 
reliable, when directed by the template. If one prohibits the 
ligation of equal terminal beads (C-C and N-N), the reliability 
of replication is expected to further increase. 

The simulations reveal that it can take a surprisingly long 
time for the oligomers to form a ligated hybridized complex 
with the template. Ligation occurs after 90r in the fastest and 
after 674r in the slowest run. The average time is estimated as 
223. 2t. The huge variance is due to the random walk of tem- 
plate and oligomers over the surface of the micelle. Compared 
to the oil-water interface of the previous section, oligomer 
motion is further slowed down by the head particles of the 
amphiphiles as well as the dimer structure of the aggregate 
building blocks. 

It is worth mentioning that as expected, the hybridized com- 
plex is significantly more stable after the ligation has occured 
than before. None of the hybridized complexes that formed in 
the above simulations showed any sign of dissociation within 




FIG. 11: The three steps of template directed replication: a) Tem- 
plate (ABBA) and oligomers (BA and AB) diffuse over the sur- 
face of the micelle, b) oligomers form a hybridization complex with 
the template strand, and c) oligomers polymerize to yield a comple- 
mentary copy of the template. 



750 time units after ligation took place. 



E. Full protocell division 

The last elemental step in the life cycle of the Los Alamos 
Bug is the fission of the grown cell into two daughter cells 
as shown in figure [12] In addition to what was discussed in 
section [Til CI here we studied the fission of the whole pro- 
tocell after the replication of its genome, that is, a micelle 
loaded with some lipid precursors, sensitizers and two com- 
plementary PNA templates. The objective is to illuminate how 
templates and sensitizers are distributed among the daughter 
cells. Although not addressed by simulations in earlier sec- 
tions, here the influence of the number of sensitizers is also 
investigated. 

Proper division into two daughter cells requires the melting 
of the double stranded PNA resulting from genome replica- 
tion, which may be achieved by a temperature cycle. In the 
DPD formalism, temperature translates into the interaction pa- 
rameters fly . To study the impact of a temperature cycle on 
the whole system, one would need to exchange the interaction 
parameters between all DPD beads. For simplicity in these 
initial investigations, and in the absence of a rigorous cali- 
bration of our model, we chose to invoke melting by simply 
turning off the attractive hybridization interactions between 
the PNA bases. 

We performed simulations of a system of size (10r c ) 3 with 
an initial protocell consisting of 20 surfactants, 20 precursors, 
4 to 8 sensitizers, and two PNA template strands randomly lo- 
cated on its surface. Otherwise, the standard parameters listed 
in the beginning of this section were used. Snapshots of the 
system are shown in Fig. [T2]ln all cases, metabolic turnover 
initiated the division of the aggregate at times of between 50 
to 100 t after the start of the simulation. Fission times were 
found to be longer than in the former experiments. This was 
because the aggregate consisted of more particles and because 
the template strands stabilized the rod-like aggregate that pre- 
cedes protocell division. It was observed that PNA strands 
were preferably located along the elongated part of the ag- 
gregate, rather than at the caps. We believe that due to the 
stiffness parameter (eq. ([8])) of the PNA strands, the aggregate 
tends to elongate in a direction that is parallel to the PNAs 
long axis. 

Using only 4 sensitizers, the distribution of sensitizers and 
PNA among the daughter cells was rather diverse: in one out 
of 10 runs, all sensitizers and templates remained in one of 
the fission products, while the other consisted of only 1 1 sur- 
factants. In 7 of the runs the partition was nearly even: both 
sensitizers and templates were equally distributed among the 
two daughter cells, which differed in aggregation number by 
at most 3 surfactants. Last but not least, we also observed two 
runs where the other components were distributed equally, but 
one of the daughter cells contained both template strands. We 
note that although it was not observed, it might be possible 
for a template to connect two otherwise divided aggregates by 
attaching to both their surfaces. 

One might expect the equipartition of sensitizers is more 
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FIG. 12: The division of the whole protocell completes the life cy- 
cle of the Los Alamos Bug. A mature protocell is loaded with pre- 
cursor molecules, sensitizers, and two complementary PNA strands. 
During the metabolic turnover of precursors, the aggregate elongates 
and divides. Both PNA strands and sensitizer molecules tend to dis- 
tribute evenly among the daughter cells, when only few sensitizers 
are present. 

likely when their number is increased. Our simulation results, 
however, showed quite the opposite: protocells loaded with 8 
sensitizers instead of 4 almost always responded by rejecting 
an average of 11 to 12 surfactants. By doing so, the proto- 
cell was able to maintain a stable spherical shape even with 
an aggregate number of 27 surfactants. This is due to the col- 
lective stabilizing effect of the strongly hydrophobic core of 
sensitizers within the aggregate. The more sensitizers that are 
added, the more they will tend to stick together. The more they 
stick together, the less likely they will partition into different 
daughter cells. Thus they are better able to stabilize the am- 
phiphilic dimers in the aggregate. For an initial protocell that 
holds 6 sensitizers, proper division can still be observed, but 
the results are less reliable than in the case of 4 sensitizers. For 
6 sensitizers, equipartition of sensitizers was only achieved in 
one out of five simulations. The other runs lead to empty mi- 
celles or a situation where one of the daughter micelles has 
only one sensitizer bead. Equipartition of PNA could not be 
achieved for the cases with either 6 or 8 sensitizer beads. 



IV. DISCUSSION 

Because of the inherent simplifications of the aggregated 
DPD simulation technique and due to the inherent complex- 
ity of our protocell system, accurate predictions of neither 
the detailed kinetic nor thermodynamic properties could be 
expected. However, insights into generic issues and likely 
system behavior could be obtained by the illumination of the 
systemic properties of the proposed protocell design. In par- 
ticular we were able to see how the global behavior emerges 
from the simple and well-defined properties of the underlying 
molecular ingredients. Interpolation between several simula- 
tion methods combined with experimental data is necessary 
to obtain predictive understanding of this protocellular sys- 
tem. Investigations based on quantum mechanics, molecular 
dynamics, reaction kinetics, combined with these and other 
DPD studies, hopefully can address the quantitative predic- 
tion issues in a more complete manner Ii 24l1 . 

We found that the micellar kinetics that underlie the 
container replication are highly affected by hydrophobic 
molecules present in the solution. In the design of the 



Los Alamos Bug, these hydrophobic molecules can be the 
metabolic precursors and sensitizers. As these molecules are 
incorporated into the protocell, they form a core that sta- 
bilizes the aggregates. Such loaded micelles have a larger 
aggregation number than micelles in a pure surfactant-water 
system, and the surfactant exchange with the bulk phase is 
strongly decreased. The simulations thus suggest that a 3- 
component (ternary) surfactant-oil-water system is more suit- 
able for yielding a suitable container than a two-component 
system based on surfactant and water only. 

We also observed that protocells grow in spurts rather 
than continuously, even with a continuous supply of resource 
molecules. This is because the oil-like precursor molecules 
form droplets before they are absorbed by the aggregates. Fur- 
thermore, due to slower diffusion of larger objects, once the 
droplets start to form, volume-wise they will tend to grow ever 
more rapidly the larger they become prior to being absorbed. 
The spurt-like support of resources might be sufficient to initi- 
ate the division process of the aggregate if these droplets have 
the appropriate size. If so, the system would be self -regulated 
and no further triggering of the metabolism as with an ex- 
ternal light switch would be necessary. Whether or not this 
self-regulation enables a reliable replication of the whole or- 
ganism also depends on a number of other factors such as the 
rate of precursor supply compared to the replication rate of the 
genome. Further simulation investigations will be necessary 
to identify whether the metabolic self-regulation is sufficient 
when the precursor supply rate is not carefully balanced. 

Our representations of the biopolymer that stores genomic 
information can be considered to be the crudest feature of the 
model. None of the implementations relate in detail to the ac- 
tual physicochemical traits of the real PNA molecule. The be- 
havior of the PNA molecule with hydrophobic side chains in 
our protocell is also found to be quite different from that seen 
for DNA or RNA in water. Unlike DNA where hybridized 
base pairs are radially opposite, in our PNA the hybridized 
bases are more likely to line up side by side in our attempts to 
model them. Furthermore, we have not been able to achieve 
an appropriate modeling of the balance between the hydro- 
gen bond formation and the ir stacking between the bases in 
large part due to the hydrophobic and amphiphilic elements 
involved. More work and new ideas are needed here. How- 
ever, we believe that the most fundamental properties of the 
biopolymer used — a PNA strand decorated with hydropho- 
bic anchors that is able to hybridize with another PNA strand 
via H-bonds — is captured, at least in a qualitative manner. 
Against the background of this caveat, two findings are of 
particular interest: the simulations reveal that even our simple 
template representations are sufficient to introduce an impact 
on the stability of the hybridization complex. In other words, 
it is observed how a molecular fitness function emerges from 
very few assumptions about the underlying molecular imple- 
mentation. Furthermore, this fitness function is not a simple 
superposition of the individual monomer properties, but rather 
depends on the sequence of nucleotides in the genome. 

Equipartition of the components among the daughter cells 
after the division was achieved only when a few hydropho- 
bic sensitizers are present in the protocell. Above a minimal 
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number of sensitizers, equipartition becomes less probable as 
the number of sensitizers is further increased. This counter- 
intuitive finding is connected to the fact that sensitizers, like 
precursors, form a hydrophobic core in the interior of the mi- 
celle, thereby increasing the allowed size of stable aggregates, 
in addition to stabilizing them overall. Since the stability of 
the core itself increases with its size, once large enough, it 
becomes nearly impossible for the core and therefore the pro- 
tocell as a whole to divide. Instead, the instability caused by 
the excess surfactants is addressed by rejecting excess indi- 
vidual surfactants one at a time. The results suggest that the 
volume of the sensitizer molecules most likely will affect the 
fission dynamics when a certain threshold is reached. 

Many open questions about systemic issues are still left 
unanswered by these initial investigations. The main open is- 
sues include: (i) What is the effect of heating the whole sys- 
tem in order to de-hybridize the gene templates? Obviously, 
the lipid aggregate has to be more heat tolerant than the gene 
duplex, (ii) What is the effect of defining the gene duplex as 
the photo-catalyst as in the originally proposed protocell de- 
sign B26I1 ? In our simulations, the sensitizer has been assumed 
to do the photo-fragmentation without any genetic catalysis. 
Also, what is the effect of having the sensitizer as a separate 
molecule (as reported here) versus covalently linking it to the 
gene, e.g. as one of the lipophilic anchors? (iii) What is the 
effect on the overall protocell replication if both the gene pre- 
cursors (oligomers) and the lipid precursors are supplied to 
the solution and have to diffuse to the protocell? In such a 
case, will we see the coordinated gene and container growth 
based on reaction kinetics predicted by Rocheleau et al. l27ll ? 
As gene replication is necessary before container division for 
two viable daughters, can that be ensured in other ways than 
through a sequential resource supply? (iv) What new issues 
arise when the protocell goes through more than one gener- 
ation of its life cycle, e.g. due to complementary resource 
sequence supplies? 

Subsequent work in this area must also relate the DPD sim- 
ulation implementation in this publication and its dynamics 
with corresponding molecular dynamics simulations 113011 and 
reaction kinetics studies 11611 as well as experimental findings 
as they arise. 



V. CONCLUSION 

The overall replication dynamics that constitute the life cy- 
cle of the Los Alamos Bug was implemented using DPD sim- 
ulations. In particular, we investigated the dynamics of con- 
tainer, metabolic complex, and genome subsystems, as well as 
the mutual interaction between these individual components. 
Component diffusion, self-assembly, precursor incorporation, 
metabolic turnover, template directed replication of the gene, 
and finally the protocellular division were studied in vari- 
ous simulations. The main systemic finds are: (a) Metabolic 
growth orchestration can be coordinated by a switchable light 
source and/or by a continuous light source together with reg- 
ulation of the size and frequency of the oily precursor pack- 
age injection, which was not anticipated, (b) As anticipated, 



there is a tradeoff between the lipophilic strength of the ge- 
netic backbone that makes it stick to the aggregate and its 
ability to easily hybridize with a complementary string, (c) 
As anticipated, for PNA with hydrophobic side chains, three 
dimensional structure formation that can potentially inhibit 
appropriate hybridization is more likely in water than at an 
oil-water or lipid-water interface, although this is in part also 
dependent on the type pf hybridization attraction, (d) Gene 
replication is easier at the surface of a micelle with a substan- 
tial oil core than for a micelle with a little or no oil core. The 
larger the oil core is, the easier the gene replication becomes 
due to the aggregate stability and the ability to have a linear 
template, (e) As anticipated, the stability of two full comple- 
mentary gene strings is much higher than a gene template and 
two complementary unligated gene pieces, (f) Rather surpris- 
ingly we observe that the template directed replication rate 
is dependent on the monomer component sequence and not 
only on the monomer component composition, (g) Partition 
of lipids, sensitizers, and gene between daughter cells strongly 
depends on the size of the oil core. The smaller the oil core 
is, the more balanced the partition becomes, which was not 
anticipated. 

These systemic findings are now being considered in the 
experimental designs being pursued as part of the Protocell 
Assembly (PAs) and Programmable Artificial Cell Evolution 
(PACE) collaborations and their validity will eventually be ad- 
dressed as the experiments are executed. 
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APPENDIX A: ALGORITHM FOR CHEMICAL 
REACTIONS 

Between every two DPD time steps, the following algo- 
rithm is applied to perform chemical reactions: For every re- 
action scheme, we successively check all possible pairs of re- 
actants A, B, and compare their effective reaction rate k to 
a number taken from a suitably normalized pseudo-random 
number generator. If the reaction rate is smaller than this 
value, we perform the reaction and go on to the next pair of 
possible reactants. However, A and B will not be considered 
again in this step. The exact algorithm — notated in the Python 
programming language — reads as follows: 

shuffle ( reaction_list ) 

for reaction in reaction_list : 

for A in space . particles ( reaction . educt_A) : 

if reaction . is_synthesis : 
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# if reaction is a synthesis, possible 

# reaction partners are particles 

# of type educt_B in the vicinity of A. 
partners = A. neighbors ( 

reaction . educt_B, reaction . R 

) 

else : 

# otherwise, possible reaction partners 

# are particles of type educt_B bonded to A. 
partners = A. bonded (reaction . educt_B) 

for B in partners : 

# compute effective reaction rate 



k = reaction. k 

for C in A. neighbors ( 

reaction . catalyst, reaction . r_cat 
) : 

k += reaction . k_cat * 

( 1- (A . pos-C . pos ) . length ( ) / reaction . r_cat ) 

if random ( ) < dt * k : 

# perform reaction 
react (A, B, reaction) 

# and leave loop over partners 
continue 
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